Perception of synthesized voice quality in connected speech by Cantonese speakers.

Authors

  • Edwin M L Yiu
  • Bruce Murdoch
  • Kathryn Hird
  • Polly Lau
Abstract

Perceptual voice analysis is a subjective process. However, despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One way to improve the reliability of this procedure is to provide judges with signals as external standards so that comparisons can be made in relation to these "anchor" signals. The present study used a Klatt speech synthesizer to create a set of speech signals, based on a Cantonese sentence, with varying degrees of three different voice qualities. The primary objective of the study was to determine, by means of a perceptual study, whether different abnormal voice qualities could be synthesized using the "built-in" synthesis parameters. The second objective was to determine the relationship between the acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of the synthesized signals was normal or not. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathy, while signals with a small degree of flutter or double pulsing were perceived as rough. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, to reach a similar level of perceived breathiness and roughness, male voice stimuli required a different amount of aspiration noise, flutter, or double pulsing than female voice stimuli. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. This opens up the possibility of using synthesized voice signals as external standards or "anchors" to improve the reliability of clinical perceptual voice evaluation.


Similar resources

Cultural and language differences in voice quality perception: a preliminary investigation using synthesized signals.

BACKGROUND Perceptual voice evaluation is a common clinical tool. However, there is as yet no consensus as to which common quality should be measured. Some available evidence shows that voice quality is a language-specific property that may differ across languages. Familiarity with a language may affect the perception of voice quality and the reliability of rating it. AIMS The ...


Effects of Cultural and Linguistic Backgrounds on Perceptual Voice Quality Rating

Perceptual voice judgment is a standard procedure in voice quality evaluation. However, the cultural and linguistic backgrounds of the judges often affect its reliability. Use of anchors has been shown to improve the reliability of the procedure. Therefore, this study aimed to develop female synthesized anchors in Cantonese, English, and Putonghua and investigate the reliability of perceptual v...


Designing a new non-parallel training method for voice conversion with better performance than parallel training

Introduction: Voice mimicking by computers has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides: on one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...


Hemispheric Asymmetry in Processing Low- and High-pass Filtered

In auditory perception, a right hemisphere (RH)/left ear advantage for low-pass filtered stimuli and a left hemisphere (LH)/right ear advantage for high-pass filtered stimuli have been reported. Here we investigated how tonal language experience modulates this hemispheric asymmetry. We recruited Cantonese, Mandarin (tonal languages), and English (non-tonal language) speakers, and asked them to ...


Speech performance of adult Cantonese-speaking laryngectomees using different types of alaryngeal phonation.

The purpose of the present study was to compare the speech performance of four types of alaryngeal phonation, namely electrolaryngeal (EL), pneumatic artificial laryngeal (PA), tracheoesophageal (TE), and standard esophageal (SE) speech, by adult Cantonese-speaking laryngectomees. Subjective ratings of (1) voice quality, (2) articulation proficiency, (3) quietness of speech, (4) pitch variability, and (...



Journal:
  • The Journal of the Acoustical Society of America

Volume 112, Issue 3 Pt 1

Pages: -

Publication date: 2002